2 research outputs found

    A SEMANTICS-BASED CLUSTERING APPROACH FOR SIMILAR RESEARCH AREA DETECTION: A CASE STUDY OF NIGERIAN UNIVERSITIES

    The place of research collaborations is indispensable in coming up with research publications. The task of detecting similar research areas is crucial to the development and furtherance of research. Prominent and rookie researchers alike are predisposed to seek existing research publications in a research field of interest before coming up with a thesis. The manual process of searching out individuals in an already existing research field is cumbersome and time-consuming. Besides, it tends to miss publications whose keywords do not match the keyword query, which leads to inaccurate results. From extant literature, automated similar research area detection systems have been developed to solve this problem. However, most of them use keyword-matching techniques, which do not sufficiently capture the implicit semantics of keywords, thereby leaving out some research articles. In this work, we have proposed a similar research area detection framework to address this problem. The aim of this study is to develop a semantics-based clustering method for similar research area detection. This study employs a number of techniques such as ontology-based pre-processing, Latent Semantic Indexing and K-Means Clustering to develop a prototype similar research area detection system that can be used to determine similar research domain publications. However, traditional document clustering techniques suffer from high dimensionality and data sparsity problems. In a bid to solve these problems, a domain ontology is used in the pre-processing stage to weight concepts and determine semantically similar concepts, while Latent Semantic Analysis is used as the topic modelling technique in order to capture the implicit semantic relationship between terms in the text corpus. To test our framework, publications from a number of Nigerian university faculties were randomly selected and used as the dataset for our clustering model. A proof-of-concept implementation was developed using the Python programming language. From the evaluation of our system, we were able to derive more accurate clustering results as a result of the integration of ontologies in the pre-processing stage, in comparison with documents that were not pre-processed with the ontology.
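The abstract above says that a domain ontology is used to weight concepts and to treat semantically similar terms as the same concept before clustering. The following is a minimal sketch of that idea only, not the authors' implementation: the ontology entries, the `CONCEPT_WEIGHT` value, the function names and the two sample titles are all hypothetical, and the Latent Semantic Indexing and K-Means stages are omitted. It compares the cosine similarity of two short texts with and without the concept mapping.

```python
import math
from collections import Counter

# Hypothetical toy ontology: surface term -> canonical concept.
# A real domain ontology would be far larger and hierarchical.
TOY_ONTOLOGY = {
    "clustering": "cluster_analysis",
    "k-means": "cluster_analysis",
    "grouping": "cluster_analysis",
    "ontology": "knowledge_model",
    "taxonomy": "knowledge_model",
}

# Assumed boost applied to terms that the ontology recognises.
CONCEPT_WEIGHT = 2.0


def to_weighted_vector(text, ontology=None):
    """Build a term -> weight mapping; map terms to concepts if an ontology is given."""
    vector = Counter()
    for token in text.lower().split():
        if ontology and token in ontology:
            # Replace the surface term with its concept and boost its weight.
            vector[ontology[token]] += CONCEPT_WEIGHT
        else:
            vector[token] += 1.0
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Two hypothetical titles that share a topic but few literal keywords.
doc_a = "k-means grouping of research papers"
doc_b = "clustering research publications with taxonomy"

plain = cosine(to_weighted_vector(doc_a), to_weighted_vector(doc_b))
semantic = cosine(
    to_weighted_vector(doc_a, TOY_ONTOLOGY),
    to_weighted_vector(doc_b, TOY_ONTOLOGY),
)

print(f"keyword-only similarity:  {plain:.3f}")
print(f"ontology-aware similarity: {semantic:.3f}")
```

In this toy case the ontology-aware score is higher than the keyword-only score, because "k-means", "grouping" and "clustering" collapse into one weighted concept. In a fuller pipeline, vectors like these would feed the dimensionality-reduction and clustering steps that the abstract names.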

    Semantics-based clustering approach for similar research area detection

    The manual process of searching out individuals in an already existing research field is cumbersome and time-consuming. Prominent and rookie researchers alike are predisposed to seek existing research publications in a research field of interest before coming up with a thesis. From extant literature, automated similar research area detection systems have been developed to solve this problem. However, most of them use keyword-matching techniques, which do not sufficiently capture the implicit semantics of keywords, thereby leaving out some research articles. In this study, we propose the use of ontology-based pre-processing, Latent Semantic Indexing and K-Means Clustering to develop a prototype similar research area detection system that can be used to determine similar research domain publications. Our proposed system addresses the high dimensionality and data sparsity problems faced by traditional document clustering techniques. The system is evaluated with randomly selected publications from faculties in Nigerian universities, and the results show that integrating ontologies in pre-processing provides more accurate clustering results.